Abstract

Social networks offer users new means of accessing information, essentially relying on "social
filtering", i.e. propagation and filtering of information by social contacts. The sheer amount
of data flowing in these networks, combined with the limited budget of attention of each user, 
makes it difficult to ensure that social filtering brings relevant content to the interested users. 
Our motivation in this paper is to measure to what extent self-organization of the social network 
results in efficient social filtering. 

To this end we introduce flow games, a simple abstraction that models network formation 
under selfish user dynamics, featuring user-specific interests and budget of attention. In the con- 
text of homogeneous user interests, we show that selfish dynamics converge to a stable network 
structure (namely a pure Nash equilibrium) with close-to-optimal information dissemination. 

We show in contrast, for the more realistic case of heterogeneous interests, that convergence, 
if it occurs, may lead to information dissemination that can be arbitrarily inefficient, as captured 
by an unbounded "price of anarchy".

Nevertheless the situation differs when users' interests exhibit a particular structure, cap- 
tured by a metric space with low doubling dimension. In that case, natural autonomous dynam- 
ics converge to a stable configuration. Moreover, users obtain all the information of interest to 
them in the corresponding dissemination, provided their budget of attention is logarithmic in 
the size of their interest set. 

Keywords: Network formation, self organisation, budget of attention, price of anarchy, social 
filtering 



1 Introduction 

1.1 Motivation 

Information access has been revolutionized by the advent of social networks such as Facebook, 
Google+ and Twitter. These platforms have brought about the new paradigm of "social filtering",
whereby one accesses information by "following" social contacts. 

*Part of this work was done while at Technicolor, 
This is especially true for Twitter-like microblogging social networks. In such networks the
functions of filtering, editing and disseminating news are totally distributed, in contrast to traditional
news channels. The efficiency of social filtering is critically affected by the network topology, as 
captured by the contact-follower relationships. Today's networks provide recommendations to users 
for potentially useful contacts to follow, but don't interfere any further with topology formation. 
In this sense, these networks self-organize, under the selfish decisions of individual users. 

This raises the following question: when does such an autonomous and selfish self-organizing
topology lead to efficient information dissemination? The answer will in turn indicate under what
circumstances self-organization is insufficient, and thus when additional mechanisms, such as
incentive schemes, should be introduced.

Two parameters play a key role in this problem. On the one hand each user aims to maximize 
the coverage of the topics of her interest. On the other hand, a user pays with her attention: 
filtering interesting information from spam (i.e. information that does not fall in her topics of 
interest) incurs a cost. Users must therefore trade off topic coverage against attention cost. As
pointed out by Simon [20], as information becomes abundant, another resource becomes scarce: 
attention. 

Furthermore, there is an interplay between participants in a social network where filtering by 
one user may benefit another, inducing complex dependencies in decisions about creating connections.
To model this, we introduce a flow game in network formation where some users produce news 
about specific topics and each user is interested in receiving all news about a set of topics specific to 
her. Each user is a selfish agent that can choose its incoming connections within a certain budget 
of attention in order to maximize the coverage of her set of topics of interest. 

This model is of interest on its own, as it enriches the class of existing network formation 
games with a focus on flow dissemination. This model could also be of interest in the context of 
peer-to-peer streaming and file sharing or publish/subscribe applications. 

1.2 Our results 

We restrict ourselves to the case where each incoming link incurs a unitary attention cost and where 
the budget of attention of each user is captured by an integral value. This assumption is justified 
for almost all the results of the paper where users connect to users sharing at least a constant 
fraction of topics of interest (the cost of reading news arriving from various connections can then 
be considered to be roughly the same). The utility of a user is measured by the number of topics of 
her interest that she receives. We additionally assume that a user produces news about one topic 
at most even if she redistributes other topics. This is coherent with an empirical study of Twitter
traces [1] where it is shown that ordinary users (as opposed to celebrities or newspapers) can gain 
influence by concentrating on a single topic. Although we make several simplifying assumptions, 
we believe our model grasps sufficient complexity and tackles the main phenomena. 

We derive conditions where selfish dynamics (when each user independently and repeatedly 
tries to increase her own utility) converge to a pure Nash equilibrium, that is a state where no user 
can receive more topics by changing her connections while other users keep the same connections. 
(In the sequel we will simply speak of equilibrium for pure Nash equilibrium.) We then give 
approximation ratios bounding the quality of an equilibrium compared to an optimal solution. 
This is traditionally measured through the price of anarchy, that is the ratio of the global welfare 
(measured as the sum of user utilities) at an optimal solution compared to the global welfare at 
the worst equilibrium. 
More precisely, we first consider the homogeneous case where all users are interested in the
same set of topics. In this case, an optimal solution can easily be constructed by forming a ring 
between users with budget of attention at least 2. We show that the homogeneous game is not 
an exact potential game, that is, it does not admit an exact potential function as defined by [15].
(Equivalently it is not a congestion game [18], [15].) In particular, this rules out the possibility of
bounding the price of anarchy based on classical techniques using potential functions as described
in [19]. Nevertheless, the game is an ordinal potential game (we exhibit a potential function
that increases under a selfish move) implying that selfish dynamics converge to an equilibrium in
finite time. Additionally, we prove that the price of anarchy is bounded by 2 as soon as the budget of
attention is at least 3 for each user. More precisely, the price of anarchy is bounded by
$1 + \frac{1}{A-2}$, where $A$ is the average budget of attention. We indeed prove the stronger result that the utility
of a user at equilibrium is within a constant factor (approaching 1 as A increases) of the maximal 
utility she can get in any configuration. 

We then consider the heterogeneous case where each user has her own set of topics of interest.
The situation is then much more complex as computing an optimal solution may be NP-hard: one 
can easily reduce the max-cover problem to it. We leave open the question of determining if pure 
Nash equilibria always exist in this context. However, we exhibit particular configurations where 
highly inefficient topologies may arise, i.e. equilibria with linear price of anarchy. Thus in general 
there is scope for improving the performance of social networks by augmenting self-organization 
with suitable mechanisms. 

However, we show that this is not necessary when users' interests exhibit a particular structure, 
captured by a metric space. We consider the case where the topics form a subset of a metric 
space and where a user is interested in the topics falling in a given ball of the metric space. The 
ball can be seen as a specific domain of interest. (More realistically, we can consider that a user 
has several domains of interest by adapting the model for using a union of balls instead of just 
one ball.) With a slight modification to our model which is natural in this case, we can again 
exhibit an ordinal potential function implying that selfish dynamics converge to an equilibrium 
in finite time. Assuming that the metric space has low doubling dimension and similar related 
properties, we further show convergence to an equilibrium where users obtain all the information 
they are interested in, provided their budget of attention is logarithmic in the number of topics 
they are interested in. This case yields a price of anarchy of 1. The low doubling dimension occurs 
in particular when each user's interests and each topic can be captured by a vector in a hidden
Euclidean space with small dimension, and where a user is interested in the topics whose vectors
are within a certain distance from the user vector. 

1.3 Related work 

Information spread in networks has been studied extensively. Much of the past work studies the
properties of information diffusion on given networks with given sharing protocols. Our goal in this
work is to study how networks form when users create connections with the objective of efficient
content dissemination, using a game-theoretical approach. This work thus follows up on the
large amount of work on network formation games. However, to the best of our knowledge, the
objective of efficient information dissemination that we consider here is novel. We now discuss
the work in those domains that is most relevant to this paper.

Network formation games have been considered in previous work in economics and in the context 
of the formation of Internet peering relations and peer-to-peer overlay networks. Economic models
of network formation [11] use edges to represent social relations and it is typically assumed that 
the creation of an edge needs bilateral agreement since both users benefit from an edge. Our model 
is oriented and unilateral agreement is more relevant. 

Network creation games in the context of the Internet have been considered [16], where dis- 
tributed formation of undirected edges with a linear cost on each edge formed is studied. In such 
games, each user's objective is to minimize total formation cost while either minimizing distance 
to all other users [6], or ensuring connection to a given subset of nodes [2]. We consider a bound
on edge costs, in the form of a limit on the number of in-edges at each node, and further, we focus on
connections that allow flows of information. 

Interestingly, bounded budget network formation games have already been considered. Bounded 
budget connection games [13] consider a bound on each user's budget in creating edges, with the
objective being the minimization of the sum of weighted distances to other nodes. A similar 
model is considered in [3] where each user's objective is to maximize his influence, measured using 
betweenness centrality. In our work however, rather than minimizing distance to any node, we 
consider a formation game with the objective of ensuring connections to a subset of flows of interest. 

The notion of connecting to users that can provide a content flow of interest is similar to 
peer-to-peer live streaming systems [14]. Unlike peer-to-peer streaming, our model has download
constraints (in the form of budget of attention) and we do not aim to satisfy flow rates, rather our 
aim is to connect to as many sets of relevant flows as possible. Moreover, our model allows differing 
user interests. 

To the best of our knowledge the only work considering content dissemination with some game- 
theoretical approach concerns the b-matching and acyclic preference systems studied in the context 
of peer-to-peer applications [9]. As a generalization of the stable marriages problem, those systems
consider configurations of undirected edges based on mutual acceptance of an edge where unilateral 
decision is more suitable in our model. Our model is more intricate in the sense that connections 
are based not only on preferences (and affinity with other users) but also on complementarity of 
content obtained through various connections. 

In Section 5 we model the space of user interests by a metric space with low doubling dimension.
Modeling interests of users through a metric space seems a natural approach, and bounded growth
metrics, or more generally doubling metrics, have been shown to be a very general model [17] that can
capture general situations while still providing an algorithmic perspective. The doubling dimension extends
the notion of dimension from Euclidean spaces to arbitrary metric spaces. It has proven to be
useful in many application domains such as nearest neighbor queries to databases [5], network
construction [I], closest server selection [12], etc. Doubling metrics have mainly been used to model
distances in networks such as the Internet [8].

1.4 Organization of the paper 

Section 2 introduces the model. We study the case of homogeneous interests in Section 3. We first
bound the price of anarchy in Subsection 3.1 and then consider convergence under selfish dynamics
and potential functions in Subsection 3.2. The heterogeneous case in its full generality is considered
in Section 4, which details some negative results. Section 5 is dedicated to the restrictive scenario
where users' interests are captured by a doubling metric, enabling some positive results. We finally
conclude in Section 6, describing potential extensions of the current work.
2 Model



We consider a social network where users interested in some set of content topics (or subjects) 
connect to (or follow in social networking parlance) other users in order to obtain such contents, 
materialized by flows of news. Each user produces news for at most one topic (but may forward 
news from other topics she is interested in). To distinguish the role of publisher from that of 
follower, we technically assume that news concerning a given topic (or subject) are produced at a 
given node called producer which is identified with that topic. 

A flow game is defined as a tuple $(V, P, S, A)$ where $V$ is a set of users, $P$ a set of producers (or
subjects or topics), $S : V \to 2^P$ is a function associating to each user $u$ its interest set $S_u \subseteq P$,
and $A : V \to \mathbb{N}$ is a function associating to each user $u$ its budget of attention $A_u$. We let $n = |V|$
and $p = |P|$ denote the number of users and producers respectively. A flow game is homogeneous
if all users have the same interest set: $S_u = P$ for all $u \in V$. If this is not the case, the game is said to
be heterogeneous.

A strategy for user $u$ is a subset $F_u$ of $\{(v, u) : v \in V \cup P\}$ such that $|F_u| \le A_u$ ($A_u$ is an upper
bound on the in-degree of $u$). For all $(v, u) \in F_u$, we say that $u$ follows $v$ or equivalently that $u$
is connected to $v$. The collection $F = \{F_u : u \in V\}$ forms a network defined by the directed graph
$G(F) = (V \cup P, E(F))$ where $E(F) = \bigcup_{u \in V} F_u$. A user $u$ is interested in a subject $s$ if $s \in S_u$. A
user $u$ receives a subject $s \in P$ if there exists a directed path from $s$ to $u$ in $G(F)$ such that all
intermediate nodes are interested in $s$. The utility $U_u(F)$ for user $u$ is the number of subjects in
$S_u$ she receives. The utility of $u$ is maximized if $U_u(F) = |S_u|$.
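
The reception rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the names `received_subjects`, `utility`, `follows` and `interests`, as well as the toy instance, are assumptions made for the example. Each producer is identified with its subject, and a user relays a subject only if she is interested in it.

```python
from collections import deque

def received_subjects(user, follows, interests):
    """Subjects of interests[user] that `user` receives.

    follows[u]   -- set of nodes (users or producers) that u follows
    interests[u] -- interest set S_u of user u
    """
    # Forward adjacency: node -> users that follow it.
    followers = {}
    for u, sources in follows.items():
        for v in sources:
            followers.setdefault(v, set()).add(u)

    received = set()
    for s in interests[user]:
        # Breadth-first search from producer s; a user relays s only
        # if she is interested in s herself.
        seen, queue = {s}, deque([s])
        while queue:
            v = queue.popleft()
            for w in followers.get(v, ()):
                if w not in seen and s in interests[w]:
                    seen.add(w)
                    queue.append(w)
        if user in seen:
            received.add(s)
    return received

def utility(user, follows, interests):
    """U_u(F): number of subjects of S_u that u receives."""
    return len(received_subjects(user, follows, interests))

# Hypothetical toy instance: x relays "a" to t but not "b",
# because x is not interested in "b".
interests = {"x": {"a"}, "t": {"a", "b"}}
follows = {"x": {"a", "b"}, "t": {"x"}}
budget = {"x": 2, "t": 1}
assert all(len(follows[u]) <= budget[u] for u in follows)  # |F_u| <= A_u
```

In this toy instance user t has utility 1 although her interest set has size 2: the only path from b to t goes through x, who is not interested in b.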

We call a move a shift from a collection $F$ of strategies to a collection $F'$ where a single user
$u$ changes her strategy from a set $F_u$ to another set $F'_u$. (We say that $u$ rewires her connections.) The
move is selfish if $U_u(F') > U_u(F)$. Selfish dynamics (or dynamics for short) are the sequences of
selfish moves. We say that dynamics converge if any sequence of selfish moves is necessarily finite.
The network is at equilibrium (or stable) if no selfish move is possible. In standard game-theoretic
terminology, this corresponds to a pure Nash equilibrium. (Dynamics converge when any sequence
of selfish moves leads to an equilibrium.) The global welfare of the system is defined as the overall
system utility: $U = \sum_{u \in V} U_u$. The efficiency of selfish self-organization of a game is classically
captured by the notion of price of anarchy, defined as the ratio of the optimal global welfare over
the global welfare of the worst equilibrium:

$$\mathrm{PoA} = \frac{\max_{F \in \mathcal{F}} \sum_{u \in V} U_u(F)}{\min_{F \in \mathcal{E}} \sum_{u \in V} U_u(F)}$$

where $\mathcal{F}$ denotes the set of possible collections of strategies and $\mathcal{E} \subseteq \mathcal{F}$ denotes the set of equilibria.

An ordinal (or generalized) potential function [15] is a function $f : \mathcal{F} \to \mathbb{R}$ such that
$\mathrm{sign}(f(F') - f(F)) = \mathrm{sign}(U_u(F') - U_u(F))$ for any move from $F$ to $F'$ where user $u$ changes
her strategy. If $f(F') - f(F) = U_u(F') - U_u(F)$, $f$ is called an exact potential function. This
notion was introduced by Monderer and Shapley [15], who show that it is tightly related to the
notion of a congestion game [18]. Potential functions are classically used to show convergence of
dynamics and to bound the price of anarchy [19].

3 Homogeneous interests 

In this section, we assume that all users have identical sets of interests, $S_u = P$ for all $u \in V(G)$.
In this context, we shall first establish an upper bound on the price of anarchy. We will then show 
convergence of dynamics.

3.1 Bounding the price of anarchy 

We first derive a simple upper bound on the overall system utility under an optimal centrally
designed configuration. Clearly, any user $u$ cannot achieve utility larger than $p$, which corresponds
to obtaining all the subjects in $P$. Moreover, it cannot obtain more subjects than the aggregate
budget of attention of all users, that is $\sum_{u \in V(G)} A_u$, which we also denote by $nA$. However, this
bound is attained only when all users have budget of attention 1. When there are at least two users
with budget at least 2 and less than $p$ (we will restrict to that more interesting case), one can easily
see that the optimal solution consists in forming an oriented ring between users whose budget is
at least 2 and then connecting budget 1 users to some user of the ring. All remaining connections
are used to obtain distinct subjects. Each node then receives the same set of subjects. As each
node connects to a non-producer, the number of subjects gathered is at most $\sum_{u \in V(G)} (A_u - 1)$. In
that case (budget at least 2 and less than $p$ for at least two users), we have thus established the
following upper bound on the maximal utility $U^*$ a user can get:

$$U^* \le \min(p, n(A - 1)). \quad (1)$$

We now consider a distributed setting where each user selfishly rewires her incoming connections 
if she can improve her utility, i.e., if this allows her to receive more subjects. The following 
proposition shows that with homogeneous user interests and budget of attention at least 3, self 
organization is efficient if dynamics converge, achieving a price of anarchy close to 1. 

Proposition 1 Assume that $3 \le A_u \le p$ for every user $u \in V$ of a homogeneous flow game.
Then for any equilibrium the utility of a user is at least $\frac{A-2}{A-1} U^*$ where $U^*$ is the optimal utility she
can get. The price of anarchy is thus at most $1 + 1/(A - 2)$, approaching 1 for large $A$.

We first note that the above proposition is tight in the sense that a high price of anarchy can arise
for $A = 2$ as shown in Figure 1. In this particular case, a doubly linked chain forms a Nash
equilibrium gathering only two subjects in total while an oriented cycle gathers $n$ subjects. The
price of anarchy is thus $n/2$.

We now establish two lemmas before proving Proposition 1. We first establish the existence of
strongly connected components in any stable network. 

Lemma 1 If an equilibrium is reached such that there exists a path $x, u_1, \ldots, u_k$ where $x$ is a
producer, $u_k$ has in-degree bound $A_{u_k} \ge 3$ and a producer $y$ is not received by $u_k$, then there is a
path from $u_k$ to $u_1$.

Proof. Since $A_{u_k} \ge 3$, $u_k$ must be connected to two nodes $v$ and $w$ distinct from $u_{k-1}$. We first
claim that $v$ must bring at least one unique subject $z_1$, otherwise $u_k$ could unfollow $v$ and follow
$y$ instead. Similarly, $w$ must bring at least one unique subject $z_2$. Then if there is no path from $u_k$
to $u_1$, $u_1$ would unfollow $x$ and follow $u_k$ instead, so that she only loses one subject $x$ but gains at
least two subjects $z_1$ and $z_2$. □

The following Lemma aims at using the fact that users will tend to avoid redundant links at 
equilibrium. 
(b) A Nash equilibrium configuration
Figure 1: Homogeneous interest sets with degree A
Lemma 2 Consider a strongly connected graph $G$ with $n$ nodes and $m$ arcs (multiple arcs are
allowed). If $m \ge 2n - 1$, then $G$ contains a transitivity arc (i.e. an arc $(s,t)$ such that there exists
another directed path from $s$ to $t$).

Proof. We prove the result by induction on $n$, in its contrapositive form: a strongly connected graph
with no transitivity arc has $m < 2n - 1$. The property is true for $n = 1$. We denote by
$n(G)$ the number of nodes in the graph $G$ and by $m(G)$ the number of arcs in the graph $G$. Now
consider $n > 1$ and assume that the property is true for any graph $G'$ with $n(G') < n$. Consider
a strongly connected graph $G$ with $n$ nodes containing no transitivity arc. Since $n \ge 2$, $G$ must
contain a circuit, i.e. an oriented cycle, with $k \ge 2$ nodes. The only arcs connecting two nodes
of the circuit are the circuit arcs (otherwise, we would encounter a transitivity arc). Consider
the graph $G'$ obtained by contracting the circuit to one node. We have $m(G') = m(G) - k$ and
$n(G') = n(G) - k + 1 < n$. Note that $G'$ does not contain a transitivity arc either. Our induction
hypothesis thus implies that $m(G') < 2n(G') - 1$. That is, $m(G) - k < 2(n - k + 1) - 1$, or equivalently
$m(G) < 2n - k + 1 \le 2n - 1$ as $k \ge 2$. The property is thus satisfied for $n$. □
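
As a small sanity check of Lemma 2, the following Python sketch lists the transitivity arcs of a directed multigraph. It is an illustration only: the function names and the example graph are assumptions, and a transitivity arc is read here as an arc $(s,t)$ for which a directed path from $s$ to $t$ exists that does not use that arc.

```python
def has_path(arcs, skip, s, t):
    """True if t is reachable from s without using the arc at index `skip`."""
    adj = {}
    for i, (a, b) in enumerate(arcs):
        if i != skip:
            adj.setdefault(a, []).append(b)
    stack, seen = [s], {s}
    while stack:
        v = stack.pop()
        for w in adj.get(v, ()):
            if w == t:
                return True
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def transitivity_arcs(arcs):
    """Arcs (s, t) that are doubled by another directed path from s to t."""
    return [(s, t) for i, (s, t) in enumerate(arcs)
            if has_path(arcs, i, s, t)]

# Hypothetical example: a doubly linked chain on n = 3 nodes has
# m = 2n - 2 = 4 arcs, is strongly connected, and has no transitivity arc.
chain = [(0, 1), (1, 0), (1, 2), (2, 1)]
# Adding one more arc gives m = 2n - 1, so Lemma 2 applies.
chain_plus = chain + [(0, 2)]
```

The doubly linked chain shows that the threshold $2n - 1$ cannot be lowered: it reaches $m = 2n - 2$ without any transitivity arc, while the added arc (0, 2) is doubled by the path 0, 1, 2.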

We are now ready to prove Proposition 1.
Proof (of Proposition 1). Consider any equilibrium. Assume that some user $u$ receives less than $p$
subjects. Then $u$ must be connected to some producer $x$ by a path $x, u_1, \ldots, u_k = u$ (possibly $k = 1$).
Consider the graph $G'$ induced by users reachable from $u_1$ that receive less than $p$ subjects. By
Lemma 1, $G'$ is strongly connected and all its users receive the same number $p' < p$ of subjects.

We claim that two users $u$ and $v$ of $G'$ cannot follow the same producer $y$. As there exists a
path from $u$ to $v$, the link $(y, v)$ would be redundant and $v$ would follow some unreceived subject
instead. Moreover, the fact that users in $G'$ do not receive all subjects implies that they have spent
all their budget of attention. We thus conclude $m(G') = \sum_{u \in V(G')} A_u - p'$. As the network is
stable, there is no transitivity arc in $G'$. Lemma 2 thus implies $m(G') \le 2n(G') - 2 < 2n(G')$. We
thus get $p' \ge \sum_{u \in V(G')} A_u - 2n(G') = \sum_{u \in V(G')} (A_u - 2)$.

First consider the case $p' \le p - 2$. Suppose there exists a user $w \notin V(G')$. She cannot receive
two subjects not received in $G'$, otherwise $u_1$ would unfollow $x$ and connect to $w$. As $A_w \ge 3$, $w$
can gather the $p'$ subjects received in $G'$ plus two others by connecting to one node in $G'$ plus the
two corresponding producers, a contradiction as this would increase her utility. We thus conclude
that $G'$ indeed contains all users, implying $p' \ge n(A - 2)$. Using (1), the utility of each user is at
least $p' \ge \left(1 - \frac{1}{A-1}\right) U^*$.

Finally, in all remaining cases to consider, all users receive at least $p - 1$ subjects. The utility
of each user is thus at least $\frac{p-1}{p} U^* \ge \frac{A-2}{A-1} U^*$ as $A - 1 \le p$. □

3.2 Convergence to equilibrium and potential functions 

We have thus shown that stable configurations of self-organizing networks with homogeneous user 
interests are efficient. However, do network dynamics converge to an equilibrium? The following
proposition answers this question in the affirmative. 

Proposition 2 Any homogeneous flow game has an ordinal potential function, implying that selfish 
dynamics always converge to an equilibrium in finite time. 

Proof. Let $n_i$ denote the number of users that receive $i$ subjects and consider the sequence
$(n_0, n_1, \ldots, n_p)$. We show that this sequence always decreases according to lexicographic ordering
when users make selfish moves. The function $-\sum_{0 \le i \le p} n_i n^{p-i}$ is thus a potential function that
will always increase until a local maximum is reached, proving convergence to an equilibrium.

Consider a user $u$ that is receiving $i$ subjects and that will make a selfish move to receive $j > i$
subjects instead. Note that there is no path from $u$ to any other user receiving $k < i$ subjects.
Therefore any change by $u$ will not affect these users. Now consider any user $v$ with $k \ge i$ subjects.
If there is no path from $u$ to $v$ then $u$'s selfish move does not affect $v$. If there is such a path, then
$v$ will now receive at least $j > i$ subjects. We thus now have at most $n_i - 1$ users receiving $i$ subjects, and
the sequence $(n_0, n_1, \ldots, n_p)$ has decreased according to lexicographic ordering. □

Our proof yields a very loose bound of $n^{p+1}$ on convergence time. We leave as an open question
whether exponential time of convergence can really arise. However, we show that a homogeneous
flow game with at least 4 subjects, a user with budget of attention at least 2 and a user with budget 
of attention at least 3, is not equivalent to a congestion game. This rules out the possibility of 
using techniques similar to [7] to find equilibria in polynomial time, and more generally to easily 
bound convergence time. 

To prove this, we show that the game does not admit an exact potential function (which amounts
to the game not being equivalent to a congestion game [15]). To show this, it is sufficient to
exhibit a 4-cycle in the strategy space such that the sum of utility variations over the 4 moves is
non-zero. (The variation of an exact potential function along the cycle would obviously
be zero and would also have to be equal to that sum, leading to a contradiction as shown more
formally in [15].) Without loss of generality, the game contains four producers $\{a, b, c, d\}$ and two
users $u, v$ with $A_u \ge 2$ and $A_v \ge 3$ as depicted in Figure 2. User $u$ can adopt in particular strategy
$A = \{(a, u)\}$ or $B = \{(b, u), (c, u)\}$. User $v$ can adopt in particular strategy $C = \{(u, v), (b, v), (c, v)\}$
or $D = \{(u, v), (d, v)\}$. Consider the cycle $(A,C) \to (B,C) \to (B,D) \to (A,D) \to (A,C)$ where
user $u$ moves from strategy $A$ to $B$ increasing its utility by 1, then $v$ moves from $C$ to $D$ and
increases its utility by 1, then $u$ moves back to $A$ with a utility variation of $-1$, and finally $v$ moves
back to $C$ increasing its utility by 1 again. The overall sum is thus $2 \ne 0$.
Figure 2: A 4-cycle $(A,C) \to (B,C) \to (B,D) \to (A,D) \to (A,C)$ in the strategy space.
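
The utility variations along this 4-cycle can be recomputed mechanically. The Python sketch below is an illustration under stated assumptions: the helper `utility` and the variable names are hypothetical, strategies are written as the sets of nodes followed, and $C$ is read as the strategy in which $v$ follows $u$, $b$ and $c$. Interests are homogeneous, so a user receives every producer from which she is reachable.

```python
PRODUCERS = {"a", "b", "c", "d"}

def utility(user, follows):
    """Homogeneous case: number of producers that reach `user`."""
    seen, stack, got = {user}, [user], set()
    while stack:
        v = stack.pop()
        for w in follows.get(v, ()):
            if w in PRODUCERS:
                got.add(w)
            elif w not in seen:
                seen.add(w)
                stack.append(w)
    return len(got)

# Strategies from the text, written as the sets of followed nodes.
A, B = {"a"}, {"b", "c"}            # strategies of user u
C, D = {"u", "b", "c"}, {"u", "d"}  # strategies of user v

# The 4-cycle (A,C) -> (B,C) -> (B,D) -> (A,D) -> (A,C); u and v alternate.
cycle = [(A, C), (B, C), (B, D), (A, D), (A, C)]
movers = ["u", "v", "u", "v"]

deltas = []
for mover, (fu0, fv0), (fu1, fv1) in zip(movers, cycle, cycle[1:]):
    before = utility(mover, {"u": fu0, "v": fv0})
    after = utility(mover, {"u": fu1, "v": fv1})
    deltas.append(after - before)

total = sum(deltas)
```

An exact potential function would force `total` to be zero, since its own variation around a closed cycle vanishes; a non-zero total therefore rules one out.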

Combining Proposition 1 and Proposition 2, we obtain:

Theorem 1 In a homogeneous flow game where each user has budget of attention at least 3, at
most $p$, and $A$ on average, selfish dynamics converge to an equilibrium such that the utility of a
user is at least $\frac{A-2}{A-1} U^*$ where $U^*$ is the optimal utility she can get, implying a price of anarchy of
$1 + 1/(A - 2)$ at most.
4 Heterogeneous interests



We now consider the more realistic case where users have differing sets of interests. We assume
user $u$ is interested in a subset $S_u \subseteq P$ of subjects. We will show that the price of anarchy of such
a system may be unbounded.

Proposition 3 In a heterogeneous flow game, the price of anarchy can be arbitrarily large: specific
choices yield a PoA of $\Omega(n/A)$.

Proof. We show the result through an example, illustrated in Figure 3. For integer $k$, consider a
system with $n = 2k$ users having budget of attention $A \ge 2$ each, and $p = 2(A - 1)k$ producers. We
distinguish two sets of users $\{a_1, \ldots, a_k\}$ and $\{b_1, \ldots, b_k\}$. Similarly, the producers are partitioned
into groups $\{A_1, \ldots, A_k\}$ and $\{B_1, \ldots, B_k\}$ where each $A_i$ (resp. $B_i$) contains $A - 1$ producers.

As illustrated in Figure 3(a), each user $a_i$ is interested in $A_i \cup B_i$ and additionally the first
element of each $A_j$ for $j \ne i$. Similarly, each user $b_i$ is interested in $A_i \cup B_i$ and additionally the
first element of each $B_j$ for $j \ne i$.

A benchmark configuration is shown in Figure 3(b) with two oriented rings, one for users $a_i$,
$i = 1, \ldots, k$ and one for users $b_i$, $i = 1, \ldots, k$. User $a_i$ is connected to $a_{i-1}$ (with $a_0$ corresponding
to $a_k$) and to all producers in $A_i$. User $b_i$ is connected to $b_{i-1}$ (with $b_0$ corresponding to $b_k$) and to
all producers in $B_i$. The corresponding utility is $n(n/2 + A - 2)$, so that the optimal global welfare
$U^*$ satisfies $U^* \ge n^2/2$.

On the other hand, the configuration shown in Figure 3(c) is an equilibrium, where each user
$a_i$ (resp. $b_i$) connects to producers in $A_i$ (resp. $B_i$) and to $b_i$ (resp. $a_i$). The global utility here is
$U = n(2A - 2) < 2nA$, and the price of anarchy is thus at least $n/(4A)$. □
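
The two welfare values used in this proof can be recomputed on a concrete instance. The following Python sketch is illustrative only: `welfare`, `build_example` and the tuple encoding of users and producers are assumptions made for the example. It evaluates both configurations but does not check that the second one is an equilibrium.

```python
from collections import deque

def welfare(follows, interests, producers):
    """Global welfare: number of (user, received subject) pairs."""
    followers = {}
    for u, sources in follows.items():
        for v in sources:
            followers.setdefault(v, []).append(u)
    total = 0
    for s in producers:
        # Users reached from s through users interested in s.
        seen, queue = {s}, deque([s])
        while queue:
            v = queue.popleft()
            for w in followers.get(v, ()):
                if w not in seen and s in interests[w]:
                    seen.add(w)
                    queue.append(w)
        total += len(seen) - 1  # exclude the producer itself
    return total

def build_example(k, A):
    """Instance of the proof: users ('a', i), ('b', i); producers (side, i, j)."""
    def group(side, i):
        return {(side, i, j) for j in range(A - 1)}

    producers = set().union(*(group(x, i) for x in "AB" for i in range(k)))
    interests, ring, pairs = {}, {}, {}
    for i in range(k):
        for me, other, mine in (("a", "b", "A"), ("b", "a", "B")):
            u = (me, i)
            firsts = {(mine, j, 0) for j in range(k) if j != i}
            interests[u] = group("A", i) | group("B", i) | firsts
            ring[u] = group(mine, i) | {(me, (i - 1) % k)}  # benchmark rings
            pairs[u] = group(mine, i) | {(other, i)}        # paired configuration
    return producers, interests, ring, pairs

k, A = 10, 3          # hypothetical parameters, n = 2k users
n = 2 * k
producers, interests, ring, pairs = build_example(k, A)
w_ring = welfare(ring, interests, producers)
w_pairs = welfare(pairs, interests, producers)
```

With these parameters the ratio `w_ring / w_pairs` already exceeds $n/(4A)$, and it grows linearly in $n$ when $A$ is kept fixed.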

We have shown that the price of anarchy can be unbounded with respect to the number of users 
in some cases. The question of determining if pure Nash equilibria exist is left open. We conjecture 
that selfish dynamics may not converge. 



5 Structured interest sets 

We now revisit the efficiency of social filtering in a heterogeneous scenario, where interest sets are
no longer arbitrary but instead are organized according to a well-behaved geometry. Specifically we
assume the following model. A metric $d$ is given on a set $P' \supseteq P$ of subjects. The interest set $S_u$
of each user $u$ then coincides with a ball $B(s_u, R_u)$ in this metric, specified by a central subject $s_u$
and a radius of interest $R_u$. Without loss of generality, we can assume $P' = \{s_u : u \in V\} \cup P$ and
$S_u = B(s_u, R_u) \cap P$. We shall first give conditions on the metric $d$ and the sets $S_u$ under which an
efficient configuration exists. We will then introduce modified dynamics and filtering rules which
guarantee stability, i.e. convergence to an equilibrium. A flow game where interest sets can be
defined in this way is called a metric flow game.

The model can easily be generalized to more eclectic user interests where topics a user is 
interested in correspond to the disjoint union of a constant number of balls. We leave out the 
details of such generalizations so as to keep the focus of the paper. However, we include a brief 
discussion later in the section, in the context of Proposition 4.
Figure 3: Heterogeneous interest sets. (a) Interest sets. (b) Benchmark configuration. (c) A Nash equilibrium configuration.
5.1 Sufficient conditions for optimal utility

Consider the following properties of the interest set geometry. 

1. $\gamma$-doubling: $d$ is $\gamma$-doubling, i.e. for any subject $s$ and radius $R$, the ball $B(s, R)$ can be covered
by $\gamma$ balls of radius $R/2$: there exists $I \subseteq S$ such that $|I| \le \gamma$ and $B(s, R) \subseteq \bigcup_{t \in I} B(t, R/2)$.

2. $r$-covering: $r$ is a covering radius, i.e. any subject $s \in P$ is at distance at most $r$ from the
central subject $s_u$ of some user $u$ with interest radius $R_u \ge r$.

3. $(r, \delta)$-sparsity: there are at most $\delta$ subjects within distance $r$: $|B(s, r)| \le \delta$ for all $s$.

4. $r$-interest-radius regularity: for any users $u, v$ with $d(s_u, s_v) \le 3R_u/2 + r$, we have $R_v \ge
R_u/2 + r$ (users with similar interests have comparable interest radii).

Property (1) is a classical generalization of dimension from Euclidean geometry to abstract
metric spaces (a Euclidean space with dimension $k$ is $2^{\Theta(k)}$-doubling). This is a natural assumption
if user interests can be modeled by proximity in a hidden low-dimensional space. Property (2) states
that all subjects are within distance $r$ from some user's center of interest and can thus be seen as
an assumption of minimum density of users' interests over the whole set $P$ of available subjects.
Property (3) puts an upper bound on the density of subjects. In other words, we assume a level of
granularity under which we do not distinguish subjects. Property (4) is another form of regularity
assumption, requiring some smoothness in the radii of interests of nearby users. This may be the 
most debatable assumption, for instance if we consider the case of an expert next to an amateur. 
However, if we assume that a topic is split into several subjects according to the level of expertise 
required to understand the corresponding news, the assumption becomes more natural as an expert 
is still interested in related subjects (with lower level of understanding) and an amateur still has 
some focus if the correct number of levels is considered. 
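
As a rough illustration, the covering, sparsity and regularity properties can be checked directly on a small finite instance. The sketch below is not part of the model: the function names, the representation of users as (center, radius) pairs, and the toy instance on a line are assumptions made here for illustration. The doubling property is left out because checking it requires a search over candidate covers.

```python
def is_sparse(subjects, r, delta, dist):
    """(r, delta)-sparsity: every ball of radius r holds at most delta subjects."""
    return all(sum(1 for t in subjects if dist(s, t) <= r) <= delta for s in subjects)

def is_covering(subjects, users, r, dist):
    """r-covering: each subject is within r of the center of a user with radius >= r."""
    return all(any(R >= r and dist(s, c) <= r for c, R in users) for s in subjects)

def is_radius_regular(users, r, dist):
    """r-interest-radius regularity: users with nearby centers have comparable radii."""
    return all(Rv >= Ru / 2 + r
               for cu, Ru in users
               for cv, Rv in users
               if dist(cu, cv) <= 3 * Ru / 2 + r)

def line_dist(a, b):
    # Hypothetical metric: subjects and centers are points on a line.
    return abs(a - b)

subjects = list(range(11))                    # subjects 0..10
users = [(c, 4) for c in range(0, 11, 2)]     # (center, radius) pairs
irregular = users[:-1] + [(10, 2)]            # last user has a much smaller radius

print(is_sparse(subjects, 1, 3, line_dist))
print(is_covering(subjects, users, 1, line_dist))
print(is_radius_regular(users, 1, line_dist), is_radius_regular(irregular, 1, line_dist))
```

Taking $u = v$ in the regularity condition shows that it already forces every interest radius to be at least $2r$, which the check above enforces implicitly.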

We now show that an optimal solution exists, i.e. one in which each user receives all subjects
in her interest set, as soon as her budget of attention $A_u$ satisfies $A_u \ge \gamma\delta + \gamma^2 \log(R_m/r)$, where $R_m$
is the maximum radius of interest over all users. This will be a direct consequence of the following
proposition.

Proposition 4 Consider a metric flow game satisfying the $\gamma$-doubling, $r$-covering, $(r, \delta)$-sparsity and
$r$-interest-radius regularity assumptions. If in addition each user $u$ has a budget of attention at least
$\gamma\delta + \gamma^2 \log(R_u/r)$, then there exists a collection of strategies such that each user $u$ receives all subjects
in $S_u$.

This result can easily be extended to the case where each user's interest set is given by a disjoint
union of balls (the number of balls being at most a constant b). It suffices to repeat the construction 
of the proof for each ball, resulting in a factor b in the resulting required budget of attention. The 
assumptions have to be slightly modified so that any subject is covered by some ball of a user 
(in the covering assumption) and that two nearby balls have comparable radii (in the regularity 
assumption) . 

Proof. For each user $u$ and each integer $i > 0$, define the ball $B_{u,i} := B(s_u, \min(R_u, 2^i r))$. The
construction to follow will ensure that $u$ collects all subjects in $B_{u,i}$ through a set $N_{u,i}$ of contacts
such that $B_{u,i} \subseteq \bigcup_{v \in N_{u,i}} B_{v,i-1}$.

We first define $N_{u,1} = \{p_s : s \in B_{u,1}\}$. Now, for $2 \le i \le \lceil \log(R_u/r) \rceil$, the $\gamma$-doubling assumption
implies that $B_{u,i}$ can be covered by at most $\gamma^2$ balls of radius $2^{i-2} r$: there exists a set $L_{u,i}$ of at
most $\gamma^2$ subjects such that $B_{u,i} \subseteq \bigcup_{s \in L_{u,i}} B(s, 2^{i-2} r)$. From the $r$-covering assumption, we can
then define a set $N_{u,i}$ of at most $\gamma^2$ users such that each $s \in L_{u,i}$ is at distance at most $r$ from
some $s_v$ with $v \in N_{u,i}$. We then have $B_{u,i} \subseteq \bigcup_{v \in N_{u,i}} B(s_v, 2^{i-2} r + r)$. Without loss of generality, we
can assume that for each $s \in L_{u,i}$, $B(s, 2^{i-2} r)$ intersects $B_{u,i}$ (otherwise $s$ can safely be removed
from $L_{u,i}$ as it does not cover anything useful). We thus have $d(s_u, s) \le R_u + 2^{i-2} r < 3R_u/2$ (note
that $2^{i-1} r < R_u$ as $i \le \lceil \log(R_u/r) \rceil$). For $v \in N_{u,i}$ such that $d(s, s_v) \le r$, we then have
$d(s_u, s_v) \le 3R_u/2 + r$. From the $r$-interest-radius regularity, we then deduce $R_v \ge R_u/2 + r > 2^{i-2} r + r$,
implying $\min(R_v, 2^{i-1} r) \ge 2^{i-2} r + r$. The ball $B_{v,i-1}$ thus contains $B(s_v, 2^{i-2} r + r) \supseteq B(s, 2^{i-2} r)$.
Together with the definition of $L_{u,i}$, this proves $B_{u,i} \subseteq \bigcup_{v \in N_{u,i}} B_{v,i-1}$.

The connection graph $G$ results from connecting each user $u$ to all contacts in the set
$\bigcup_{1 \le i \le \lceil \log(R_u/r) \rceil} N_{u,i}$.

Flow correctness: We show by induction on $i$ that each user $u$ receives all subjects in $B_{u,i}$.
The direct connection to producers for subjects in $B_{u,1}$ ensures this for $i = 1$. For $i > 1$, the
induction hypothesis implies that each user $v \in N_{u,i}$ receives all subjects in $B_{v,i-1}$. From
$B_{u,i} \subseteq \bigcup_{v \in N_{u,i}} B_{v,i-1}$, we conclude that $u$ will receive news about subjects in $B_{u,i}$ from its
contacts in $N_{u,i}$. As $S_u = B_{u,\lceil \log(R_u/r) \rceil}$, we finally know that $u$ receives all subjects in $S_u$.

In-degree bound: First, we have $|N_{u,1}| \le \gamma\delta$. This comes from the fact that $B_{u,1}$ is included
in at most $\gamma$ balls of radius $r$ from the $\gamma$-doubling assumption, and each of these balls contains at
most $\delta$ subjects from the $(r, \delta)$-sparsity assumption. Second, we have already seen that $|N_{u,i}| \le \gamma^2$
for $2 \le i \le \lceil \log(R_u/r) \rceil$. We thus obtain the bound
$\gamma\delta + \gamma^2 (\lceil \log(R_u/r) \rceil - 1) \le \gamma\delta + \gamma^2 \log(R_u/r)$. □

A set of $\gamma^2$ balls of radius $2^{i-1} r$ sufficient to cover a given ball of radius $2^i r$ can be computed
through a simple greedy covering algorithm [10]. A solution where the required budget of attention
is within a factor $\gamma$ from the bound of Proposition 4 can thus be computed in polynomial time.
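
A minimal sketch of one such greedy covering is given below. It is a standard greedy net construction and is not claimed to be the exact algorithm of [10]: points of the ball are scanned in order, and a point becomes a new center only if it lies farther than half the radius from every center chosen so far. In a doubling metric with constant $\gamma$, a packing argument bounds the number of centers by $\gamma^2$, since the ball is covered by $\gamma^2$ balls of a quarter of the radius and each of them can contain at most one center. The grid used as subject space and the function names are assumptions made for illustration.

```python
import math

def greedy_cover(points, center, radius, dist):
    """Choose centers in B(center, radius) whose balls of radius/2 cover that ball."""
    ball = [p for p in points if dist(center, p) <= radius]
    centers = []
    for p in ball:
        # p becomes a center only if no chosen center already covers it.
        if all(dist(p, c) > radius / 2 for c in centers):
            centers.append(p)
    return centers

def euclid(a, b):
    return math.dist(a, b)

# Hypothetical subject space: integer grid points of the plane.
grid = [(x, y) for x in range(-8, 9) for y in range(-8, 9)]
centers = greedy_cover(grid, (0, 0), 8.0, euclid)
print(len(centers))
```

Every point of the ball is either a center or was rejected because a center lies within half the radius, so the cover is valid by construction; the loop performs a quadratic number of distance evaluations in the worst case.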

As previously mentioned, a budget of attention of $A = \gamma\delta + \gamma^2 \log(R_m/r)$ per user is thus enough for
maximum utility. This scales logarithmically in $R_m$, while under the assumptions of the theorem
one can arrange interest sets to have size polynomial in R m (take for example interests to be 
regularly placed on a lattice). Thus this configuration gives substantial savings in comparison to 
one where users would connect directly to all their subjects. 
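
To give a feel for the orders of magnitude, the sketch below compares the budget bound with the size of an interest set on the integer lattice of the plane. The values are illustrative assumptions: $r = 1$ and $\delta = 5$ match the lattice (a ball of radius 1 holds 5 lattice points), whereas the doubling constant 7 is simply assumed and not derived here, and the logarithm is taken in base 2.

```python
import math

def budget_bound(gamma, delta, radius, r):
    """Budget of attention gamma*delta + gamma^2 * log2(radius / r)."""
    return gamma * delta + gamma ** 2 * math.log2(radius / r)

def lattice_interest_size(radius):
    """Count integer lattice points within Euclidean distance `radius` of the origin."""
    k = math.floor(radius)
    return sum(1 for x in range(-k, k + 1) for y in range(-k, k + 1)
               if x * x + y * y <= radius * radius)

# Assumed parameters: gamma = 7, delta = 5, r = 1 (see the lead-in).
for radius in (4, 16, 64):
    print(radius, lattice_interest_size(radius), budget_bound(7, 5, radius, 1))
```

With these assumed constants the bound still exceeds the interest set size at radius 4 (133 against 49 lattice points), so the savings only appear once the radius of interest is large compared with $r$.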

Clearly the configuration graph identified in this theorem is an equilibrium: as maximum utility 
is reached, no user can increase its utility by reconnecting. We now study conditions that guarantee 
convergence of selfish dynamics. 

5.2 Sufficient conditions for stability 

We first define two rules regarding republication of subjects received and reconnections. 

1. Expertise-filtering rule: when a user $u$ is connected to a user $v$, $u$ only receives subjects $s$
such that $d(s_v, s) \le d(s_u, s)$.

2. Nearest-subject rule for re-connection: when reconnecting, each user $u$ gives priority to
subjects that are closer to $s_u$: a new subject $s$ is gained by $u$ so that no subject $t$ with
$d(s_u, t) \le d(s_u, s)$ is lost. (On the other hand, any subject $t$ with $d(s_u, t) > d(s_u, s)$ can be
lost.)

Rule 1 can be interpreted as follows. The center of expertise of a user is the same as its center
of interest, and the distance $d$ also captures expertise of users about subjects, in that $u$ is more
expert than $v$ on subject $s$ if and only if $d(s_u, s) < d(s_v, s)$. The rule then amounts to a sanity
check where $u$ discards news from sources that have less expertise than herself on the subject. We
capture this rule with the following slight variation of the model. A flow game with expertise-filtering is a
flow game where reception of a subject $s$ by user $u$ occurs only when there exists a directed path
$s = u_0, \ldots, u_k = u$ from $s$ to $u$ such that for each $1 \le i \le k$, $s \in S_{u_i}$ (i.e. $d(s_{u_i}, s) \le R_{u_i}$) and,
for $i < k$, $d(s_{u_i}, s) \le d(s_{u_{i+1}}, s)$ (expertise filtering).
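
The reception rule with expertise filtering can be sketched as a fixed-point computation. The data layout is an assumption made here for illustration: `follows[u]` lists what user `u` listens to, either a subject (standing for a direct link to its producer) or another user, and subjects are points on a line. A subject passes from `v` to `u` only if `v` is at least as close to it as `u` and the subject lies in the interest ball of `u`.

```python
def received_subjects(follows, center, radius, dist):
    """Return {user: set of received subjects} under expertise filtering."""
    received = {u: set() for u in follows}
    changed = True
    while changed:  # propagate until nothing new is received
        changed = False
        for u, sources in follows.items():
            for src in sources:
                if src in follows:  # src is a user v: keep subjects v is more expert on
                    candidates = {s for s in received[src]
                                  if dist(center[src], s) <= dist(center[u], s)}
                else:               # src is a subject: direct link to its producer
                    candidates = {src}
                for s in candidates:
                    if dist(center[u], s) <= radius[u] and s not in received[u]:
                        received[u].add(s)
                        changed = True
    return received

def line_dist(a, b):
    return abs(a - b)

# Hypothetical instance: users 'a', 'b', 'c'; subjects are the integers 0..4.
center = {'a': 0, 'b': 4, 'c': 1}
radius = {'a': 3, 'b': 3, 'c': 5}
follows = {'a': {0, 1, 'b'}, 'b': {2, 3, 4}, 'c': {'b'}}
received = received_subjects(follows, center, radius, line_dist)
print(received)
```

In this instance user `c` does not obtain subject 2 from `b`, because `c` is closer to it than `b` is, while `a` drops subject 4 because it lies outside its interest ball.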

Rule 2 simply states that a user $u$ prefers to receive a subject it is more interested in (i.e. closer
to $s_u$) rather than any number of subjects that are less interesting. A flow game is said to have
nearest-subject priority if the utility function of each user $u$ is defined by
$U_u(F) = \max\{R : u \text{ receives all } s \in B(s_u, R)\}$ (we simply choose a function naturally reflecting
the rule).
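
A small sketch of this utility on a finite instance follows. To keep the maximum well defined, the candidate radii are restricted to the distances actually realized by subjects of the interest set, and a sentinel value of -1.0 is returned when even the nearest subject is missing; both choices, like the function name, are assumptions made here for illustration.

```python
def nearest_subject_utility(center, interest, received, dist):
    """Largest realized radius R such that every interesting subject within R is received."""
    utility = -1.0  # sentinel: the nearest subject itself is missing
    for R in sorted({dist(center, s) for s in interest}):
        if all(s in received for s in interest if dist(center, s) <= R):
            utility = R
        else:
            break  # a closer subject is missing, so larger radii cannot qualify
    return utility

def line_dist(a, b):
    return abs(a - b)

interest = [0, 1, 2, 3, 4, 5]   # hypothetical interest set on a line, centered at 0
before = {0, 1, 2, 4}           # subject 3 is missing
after = {0, 1, 2, 3}            # subject 3 gained, subject 4 lost

print(nearest_subject_utility(0, interest, before, line_dist))
print(nearest_subject_utility(0, interest, after, line_dist))
```

The second configuration has the higher utility even though the number of received subjects is unchanged, which is the behaviour the nearest-subject rule asks for.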

Proposition 5 Any metric flow game with expertise-filtering and nearest-subject priority has an
ordinal potential function, implying that selfish dynamics always converge to an equilibrium in finite
time.



Proof. Consider the set $\mathcal{D} = \{d(s, t) : s, t \in P'\}$ of all possible distances. Let $r_1, \ldots, r_m$ denote
all elements of $\mathcal{D}$ sorted in increasing order (i.e. $r_1 < \cdots < r_m$). Let $n_i$ denote the number of pairs
$(u, s)$ such that $d(s_u, s) = r_i$ and $u$ receives $s$. Consider the tuple $(n_1, \ldots, n_m)$. When a user $u$
makes a selfish move, it increases its utility by receiving a new subject $s$. Let $i$ denote the index such
that $d(s_u, s) = r_i$. Any lost subject $t$ must satisfy $d(s_u, t) > d(s_u, s)$ by the nearest-subject rule. If
a lost subject $t$ was received by some user $v$ through a path from $u$ to $v$, we have $d(s_v, t) \ge d(s_u, t)$
by the expertise-filtering rule. We thus deduce $d(s_v, t) > d(s_u, s)$, implying that $n_j$ can decrease
only for $j > i$. The tuple $(n_1, \ldots, n_m)$ thus increases according to the lexicographical order after
any selfish move. The function $\sum_{1 \le i \le m} n_i (n+p)^{2(m-i)}$ is thus a potential function that will always
increase until a local maximum is reached, proving convergence to an equilibrium. □
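
The step from a lexicographic increase of the tuple to an increase of the weighted sum can be spelled out as follows. The derivation assumes that $n$ and $p$ denote the numbers of users and subjects, so that each count is at most $np$, and it introduces the shorthand $b = (n+p)^2$; both are notational assumptions made here.

```latex
% Assumption: n users and p subjects, hence 0 <= n_j <= np <= b - 1 with b = (n+p)^2.
\[
  \Phi \;=\; \sum_{1 \le j \le m} n_j \, b^{\,m-j}, \qquad b = (n+p)^2 .
\]
% A selfish move raises n_i by at least 1, does not decrease n_j for j < i,
% and can lower each n_j with j > i by at most b - 1.
\[
  \Delta\Phi \;\ge\; b^{\,m-i} \;-\; (b-1) \sum_{i < j \le m} b^{\,m-j}
  \;=\; b^{\,m-i} - \bigl( b^{\,m-i} - 1 \bigr) \;=\; 1 \;>\; 0 .
\]
```

Since the sum takes integer values bounded by $b^m$, it can increase only finitely many times, which is the finite-time convergence claimed in the proposition.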

Again the bound on convergence time implied by the above proof is very loose. We leave open 
the question of determining better bounds or faster convergence conditions. 
We are now ready to prove the following: 

Theorem 2 Consider a metric flow game with expertise-filtering and nearest-subject priority that satisfies the
$\gamma$-doubling, $r$-covering, $(r, \delta)$-sparsity and $r$-interest-radius regularity assumptions. If in addition each
user $u$ has budget of attention at least $\gamma\delta + \gamma^2 \log(R_u/r)$, selfish dynamics converge to an equilibrium
where each user $u$ receives all subjects in $S_u$, implying that the price of anarchy is then 1.

Proof. Consider a configuration where some users do not receive some subject in their interest ball.
Let $(u, s)$ be an unsatisfied user-subject pair such that $d(s_u, s)$ is minimal. Consider the smallest
integer $i$ such that $d(s_u, s) \le 2^i r$ holds. According to the construction of Proposition 4, user $u$ can
receive all subjects in $B_{u,i} = B(s_u, \min(R_u, 2^i r))$ as long as every user $v$ receives all subjects in her
ball of radius $\min(R_v, 2^{i-1} r)$, which is the case according to the choice of the pair $(u, s)$. Note that
this construction follows the expertise filtering rule as each subject at distance greater than $2^{i-1} r$ is
retrieved through a user at distance at most $2^{i-1} r$ from the subject. User $u$ can retrieve $B_{u,i}$ using
at most $\gamma\delta + \gamma^2 (i - 1)$ connections. The configuration is thus unstable as long as $A_u \ge \gamma\delta + \gamma^2 (i - 1)$,
which is the case for $A_u \ge \gamma\delta + \gamma^2 \log(R_u/r)$. Since the system must stabilize to some equilibrium
according to Proposition 5, every user $u$ must receive all news about subjects in $S_u$ in that stable
configuration. □



6 Concluding remarks 

We have shown that a flow game can have complex dynamics that may not converge. However, we
can prove convergence to an efficient equilibrium for both homogeneous flow games (with very weak
assumptions) and metric flow games (with more technical assumptions). A direct follow-up of this
work concerns the study of the speed of convergence and the characterization of flow games having
pure Nash equilibria.

Our model makes several simplifying assumptions. We believe that several of them could
be relaxed. A natural generalization would be to consider a real-valued cost of attention for
establishing a link $(v, u)$ instead of a unitary cost. The cost of establishing link $(v, u)$ could typically
be a function of S u and S v . A natural cost taking into account the attention required to filter out 

I S I 

uninteresting content would then be c(v,u) = g 1 | , for example. 

A dual variant of our model could be to consider that every user gathers all the subjects she 
is interested in while she tries to minimize the required cost of attention. We could also mix 
both models, using utility functions combining coverage of interest set and cost of attention (the 
function being increasing in the number of interesting subjects received and decreasing in the costs 
of attention of the formed links). 

In that context, we believe the two following directions are promising for efficient social dissem- 
ination. First, incentive mechanisms, e.g. reputation counters maintained by users, or payments 
between users, may considerably augment the performance of self-organizing social flows. Sec- 
ond, more elaborate content filtering between contact-follower pairs may also lead to substantial 
improvements. We have already introduced expertise filtering, which could translate into imple- 
mentable mechanisms in existing social networking platforms. More generally there appears to be 
a rich design space of filtering rules based on combinations of interests and expertise. 